Results 1 - 16 of 16
1.
Mol Biol Evol ; 40(5), 2023 05 02.
Article in English | MEDLINE | ID: mdl-37140205

ABSTRACT

Gene loss is a prevalent source of genetic variation in genome evolution. Calling loss events effectively and efficiently is a critical step for systematically characterizing their functional and phylogenetic profiles genome wide. Here, we developed a novel pipeline integrating orthologous inference and genome alignment. Interestingly, we identified 33 gene loss events that give rise to evolutionarily novel long noncoding RNAs (lncRNAs) that show distinct expression features and could be associated with various functions related to growth, development, immunity, and reproduction, suggesting loss relics as a potential source of functional lncRNAs in humans. Our data also demonstrated that the rates of protein gene loss are variable among different lineages with distinct functional biases.


Subjects
RNA, Long Noncoding; Humans; RNA, Long Noncoding/genetics; Gene Expression Profiling; Phylogeny; Genome
2.
PNAS Nexus ; 2(5): pgad141, 2023 May.
Article in English | MEDLINE | ID: mdl-37181047

ABSTRACT

A plant can be thought of as a colony comprising numerous growth buds, each developing to its own rhythm. Such lack of synchrony impedes efforts to describe core principles of plant morphogenesis, dissect the underlying mechanisms, and identify regulators. Here, we use the minimalist known angiosperm to overcome this challenge and provide a model system for plant morphogenesis. We present a detailed morphological description of the monocot Wolffia australiana, as well as high-quality genome information. Further, we developed the plant-on-chip culture system and demonstrate the application of advanced technologies such as single-nucleus RNA-sequencing, protein structure prediction, and gene editing. We provide proof-of-concept examples that illustrate how W. australiana can decipher the core regulatory mechanisms of plant morphogenesis.

3.
Cell Host Microbe ; 30(8): 1124-1138.e8, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35908550

ABSTRACT

Constitutive activation of plant immunity is detrimental to plant growth and development. Here, we uncover the role of a long non-coding RNA (lncRNA) in fine-tuning the balance of plant immunity and growth. We find that a lncRNA termed salicylic acid biogenesis controller 1 (SABC1) suppresses immunity and promotes growth in healthy plants. SABC1 recruits the polycomb repressive complex 2 to its neighboring gene NAC3, which encodes a NAC transcription factor, to decrease NAC3 transcription via H3K27me3. NAC3 activates the transcription of isochorismate synthase 1 (ICS1), a key enzyme catalyzing salicylic acid (SA) biosynthesis. SABC1 thus represses SA production and plant immunity by decreasing the transcription of NAC3 and ICS1. Upon pathogen infection, SABC1 is downregulated to derepress plant resistance to bacteria and viruses. Together, our findings reveal lncRNA SABC1 as a molecular switch in balancing plant defense and growth by modulating SA biosynthesis.


Subjects
Arabidopsis Proteins; Arabidopsis; RNA, Long Noncoding; Arabidopsis/microbiology; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Gene Expression Regulation, Plant; Plant Diseases; Plant Immunity/physiology; Plants/genetics; RNA, Long Noncoding/genetics; Salicylic Acid
4.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34849565

ABSTRACT

Gene transcription and protein translation are two key steps of the 'central dogma.' It is still a major challenge to quantitatively deconvolute the factors contributing to the coding ability of transcripts in mammals. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in the human genome. In addition to predicting the experimentally confirmed coding abundance with high accuracy from sequence and transcription features, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts whose coding ability varies across distinct cell types (i.e. context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.


Subjects
Models, Biological; Protein Biosynthesis; RNA; Transcription, Genetic; Animals; Genome, Human; Humans; Mammals/genetics; Mammals/metabolism; RNA/metabolism; RNA, Long Noncoding/genetics; Ribosomes/genetics; Ribosomes/metabolism; Transcription, Genetic/genetics
5.
Nature ; 593(7860): 602-606, 2021 05.
Article in English | MEDLINE | ID: mdl-33953397

ABSTRACT

MicroRNAs (miRNAs) have essential functions during embryonic development, and their dysregulation causes cancer [1,2]. Altered global miRNA abundance is found in different tissues and tumours, which implies that precise control of miRNA dosage is important [1,3,4], but the underlying mechanism(s) of this control remain unknown. The protein complex Microprocessor, which comprises one DROSHA and two DGCR8 proteins, is essential for miRNA biogenesis [5-7]. Here we identify a developmentally regulated miRNA dosage control mechanism that involves alternative transcription initiation (ATI) of DGCR8. ATI occurs downstream of a stem-loop in DGCR8 mRNA to bypass an autoregulatory feedback loop during mouse embryonic stem (mES) cell differentiation. Deletion of the stem-loop causes imbalanced DGCR8:DROSHA protein stoichiometry that drives irreversible Microprocessor aggregation, reduced primary miRNA processing, decreased mature miRNA abundance, and widespread de-repression of lipid metabolic mRNA targets. Although global miRNA dosage control is not essential for mES cells to exit from pluripotency, its dysregulation alters lipid metabolic pathways and interferes with embryonic development by disrupting germ layer specification in vitro and in vivo. This miRNA dosage control mechanism is conserved in humans. Our results identify a promoter switch that balances Microprocessor autoregulation and aggregation to precisely control global miRNA dosage and govern stem cell fate decisions during early embryonic development.


Subjects
Gene Dosage; Germ Layers/metabolism; MicroRNAs/genetics; RNA-Binding Proteins/genetics; Ribonuclease III/genetics; Animals; Gene Expression Regulation, Developmental; Hep G2 Cells; Humans; K562 Cells; Lipid Metabolism/genetics; Mice; Promoter Regions, Genetic; Transcription Initiation, Genetic
6.
Methods Mol Biol ; 2254: 111-131, 2021.
Article in English | MEDLINE | ID: mdl-33326073

ABSTRACT

While more than a hundred thousand long noncoding RNAs (lncRNAs) have been identified in the human genome, their biological functions and regulation remain largely elusive. Here we present AnnoLnc, a one-stop online annotation portal for human lncRNAs (http://annolnc1.gao-lab.org/). As the first (and the most comprehensive) Web server to provide on-the-fly annotation for novel human lncRNAs, AnnoLnc exploits more than 700 data sources to annotate input lncRNAs systematically, spanning genomic location, secondary structure, expression patterns, coexpression-based functional annotation, transcriptional regulation, miRNA interaction, protein interaction, genetic association, and evolution. Moreover, in addition to a user-friendly Web interface, AnnoLnc can also be integrated into existing pipelines through either a set of JSON-based web service APIs or a stand-alone version for Linux servers.


Subjects
Molecular Sequence Annotation/methods; RNA, Long Noncoding/genetics; Databases, Nucleic Acid; Gene Expression Profiling; Gene Expression Regulation; Humans; Internet; Software; User-Computer Interface
7.
Nat Commun ; 11(1): 3458, 2020 07 10.
Article in English | MEDLINE | ID: mdl-32651388

ABSTRACT

Single-cell RNA-seq (scRNA-seq) is being used widely to resolve cellular heterogeneity. With the rapid accumulation of public scRNA-seq data, an effective and efficient cell-querying method is critical for the utilization of the existing annotations to curate newly sequenced cells. Such a querying method should be based on an accurate cell-to-cell similarity measure, and capable of handling batch effects properly. Herein, we present Cell BLAST, an accurate and robust cell-querying method built on a neural network-based generative model and a customized cell-to-cell similarity metric. Through extensive benchmarks and case studies, we demonstrate the effectiveness of Cell BLAST in annotating discrete cell types and continuous cell differentiation potential, as well as identifying novel cell types. Powered by a well-curated reference database and a user-friendly Web server, Cell BLAST provides the one-stop solution for real-world scRNA-seq cell querying and annotation.


Subjects
RNA-Seq/methods; Software; Algorithms; Machine Learning; Transcriptome/genetics
8.
Nucleic Acids Res ; 48(W1): W230-W238, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32406920

ABSTRACT

With the abundant mammalian lncRNAs identified recently, a comprehensive annotation resource for these novel lncRNAs is an urgent need. Since its first release in November 2016, AnnoLnc has been the only online server for comprehensively annotating novel human lncRNAs on-the-fly. Here, with significant updates to multiple annotation modules, backend datasets and the code base, AnnoLnc2 continues the effort to provide the scientific community with a one-stop online portal for systematically annotating novel human and mouse lncRNAs with a comprehensive functional spectrum covering sequences, structure, expression, regulation, genetic association and evolution. In response to numerous requests from multiple users, a standalone package is also provided for large-scale offline analysis. We believe that updated AnnoLnc2 (http://annolnc.gao-lab.org/) will help both computational and bench biologists identify lncRNA functions and investigate underlying mechanisms.


Subjects
Molecular Sequence Annotation; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/metabolism; Software; Animals; Evolution, Molecular; Gene Expression Regulation; Humans; Mice; RNA, Long Noncoding/genetics
9.
Nucleic Acids Res ; 48(D1): D1104-D1113, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701126

ABSTRACT

With the goal of charting plant transcriptional regulatory maps (i.e. transcription factors (TFs), cis-elements and interactions between them), we have upgraded the TF-centred database PlantTFDB (http://planttfdb.cbi.pku.edu.cn/) to a plant regulatory data and analysis platform PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) over the past three years. In this version, we updated the annotations for the previously collected TFs and set up a new section, 'extended TF repertoires' (TFext), to allow users prompt access to the TF repertoires of newly sequenced species. In addition to our regular TF updates, we are dedicated to updating the data on cis-elements and functional interactions between TFs and cis-elements. We established genome-wide conservation landscapes for 63 representative plants and then developed an algorithm, FunTFBS, to screen for functional regulatory elements and interactions by coupling the base-varied binding affinities of TFs with the evolutionary footprints on their binding sites. Using the FunTFBS algorithm and the conservation landscapes, we further identified over 20 million functional TF binding sites (TFBSs) and two million functional interactions for 21 346 TFs, charting the functional regulatory maps of these 63 plants. These resources are publicly available at PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) and a cloud-based mirror (http://plantregmap.gao-lab.org/), providing the plant research community with valuable resources for decoding plant transcriptional regulatory systems.


Subjects
Computational Biology/methods; Databases, Genetic; Gene Expression Regulation, Plant; Plants/genetics; Transcription, Genetic; Binding Sites; Chromosome Mapping; Evolution, Molecular; Molecular Sequence Annotation; Phylogeny; Plants/metabolism; Protein Binding; Transcription Factors/metabolism; Web Browser
10.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30339214

ABSTRACT

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with strong genetic contributions. To provide a comprehensive resource for the genetic evidence of ASD, we have updated the Autism KnowledgeBase (AutismKB) to version 2.0. AutismKB 2.0 integrates multiscale genetic data on 1379 genes, 5420 copy number variations and structural variations, 11 669 single-nucleotide variations or small insertions/deletions (SNVs/indels) and 172 linkage regions. In particular, AutismKB 2.0 highlights 5669 de novo SNVs/indels due to their significant contribution to ASD genetics and includes 789 mosaic variants due to their recently discovered contributions to ASD pathogenesis. The genes and variants are annotated extensively with genetic evidence and clinical evidence. To help users fully understand the functional consequences of SNVs and small indels, we provided comprehensive predictions of pathogenicity with iFish, SIFT, Polyphen etc. To improve user experiences, the new version incorporates multiple query methods, including simple query, advanced query and batch query. It also functionally integrates two analytical tools to help users perform downstream analyses, including a gene ranking tool and an enrichment analysis tool, KOBAS. AutismKB 2.0 is freely available and can be a valuable resource for researchers.


Subjects
Autism Spectrum Disorder/genetics; Knowledge Bases; Genetic Predisposition to Disease; Humans; Internet; Molecular Sequence Annotation; User-Computer Interface
11.
Nucleic Acids Res ; 45(W1): W12-W16, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28521017

ABSTRACT

With advances in next-generation sequencing technologies, numerous novel transcripts in a large number of organisms have been identified. With the goal of fast, accurate assessment of the coding ability of RNA transcripts, we upgraded the coding potential calculator CPC1 to CPC2. CPC2 runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts. Moreover, the model of CPC2 is species-neutral, making it feasible for ever-growing non-model organism transcriptomes. A mobile-friendly web server, as well as a downloadable standalone package, is freely available at http://cpc2.cbi.pku.edu.cn.


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, RNA/methods; Software; Algorithms; Animals; Gene Expression Profiling; Humans; Internet; Mice; RNA, Long Noncoding/chemistry; RNA, Messenger/chemistry; RNA, Small Untranslated/chemistry
12.
Nucleic Acids Res ; 45(D1): D1040-D1045, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924042

ABSTRACT

With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from the literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal, PlantRegMap (http://plantregmap.cbi.pku.edu.cn), for users to access the regulation resource and analysis tools conveniently.


Subjects
Databases, Genetic; Gene Expression Regulation, Plant; Gene Regulatory Networks; Plants/genetics; Plants/metabolism; Transcription Factors/metabolism; Binding Sites; Computational Biology/methods; Evolution, Molecular; Genomics/methods; Molecular Sequence Annotation; Nucleotide Motifs; Protein Binding; Web Browser; Workflow
13.
BMC Genomics ; 17(Suppl 13): 1023, 2016 12 22.
Article in English | MEDLINE | ID: mdl-28155723

ABSTRACT

BACKGROUND: The temporal and spatial-specific expression pattern of a transcript in multiple tissues and cell types can provide key clues about its function. While several gene atlases are available online as pre-computed databases for known gene models, it is still challenging to obtain expression profiles for previously uncharacterized (i.e. novel) transcripts efficiently. RESULTS: Here we developed LocExpress, a web server for efficiently estimating the expression of novel transcripts across multiple tissues and cell types in human (20 normal tissues/cell types and 14 cell lines) as well as in mouse (24 normal tissues/cell types and nine cell lines). As a wrapper to an RNA-Seq quantification algorithm, LocExpress efficiently reduces the time cost by making abundance estimation calls increasingly within the minimum spanning bundle region of input transcripts. For a given novel gene model, this local context-oriented strategy allows LocExpress to estimate its FPKMs in hundreds of samples within minutes on a standard Linux box, making an online web server possible. CONCLUSIONS: To the best of our knowledge, LocExpress is the only web server to provide nearly real-time expression estimation for novel transcripts in common tissues and cell types. The server is publicly available at http://loc-express.cbi.pku.edu.cn .


Subjects
Databases, Nucleic Acid; Gene Expression Profiling/methods; Software; Web Browser; Algorithms; Animals; Humans; Mice; Transcription, Genetic; Transcriptome; User-Computer Interface; Workflow
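The LocExpress abstract above reports expression as FPKM (fragments per kilobase of transcript per million mapped fragments). As a rough illustration of that unit only, the sketch below applies the standard FPKM definition; the function name and the example numbers are hypothetical, and this is not LocExpress's own estimation procedure, which the abstract does not describe in detail.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Standard FPKM definition (illustrative helper, not LocExpress code).

    fragments: fragments assigned to the transcript
    transcript_length_bp: transcript length in base pairs
    total_mapped_fragments: all mapped fragments in the sample
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Scale by 1e3 (per kilobase) and 1e6 (per million fragments): 1e9 overall.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript in a
# library of 25 million mapped fragments.
example_value = fpkm(500, 2000, 25_000_000)
print(example_value)
```

Normalizing by both transcript length and library size is what makes values comparable across transcripts and samples, which matters when one transcript is profiled in hundreds of samples.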
14.
Nucleic Acids Res ; 43(W1): W85-90, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25977299

ABSTRACT

In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions.


Subjects
Genes, Archaeal; Genes, Bacterial; Software; Algorithms; Genes, Essential; Genome, Archaeal; Genome, Bacterial; Internet
15.
Zhonghua Jie He He Hu Xi Za Zhi ; 29(1): 31-4, 2006 Jan.
Article in Chinese | MEDLINE | ID: mdl-16638298

ABSTRACT

OBJECTIVE: To explore the application of serum surface-enhanced laser desorption/ionization (SELDI) marker patterns in distinguishing non-small cell lung cancer patients from healthy people by protein chip technology. METHODS: One hundred and sixty-three serum samples (123 patients with lung cancer and 40 healthy persons) were randomly divided into a training set [94 cases: 53 non-small cell lung cancer (NSCLC), 21 small cell lung cancer and 20 healthy persons] and a blinded test set (69 cases), and were analyzed by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS). Five protein peaks at 11,493, 6,429, 8,245, 5,336 and 2,536 were automatically chosen for the system training and the development of a decision classification tree model (marker pattern). The accuracy of the model was tested with the blinded test set (an independent set of masked serum samples from 49 patients with NSCLC and 20 healthy persons). RESULTS: The model differentiated the patients with NSCLC from the healthy people with a sensitivity of 95.9% (71/74) and a specificity of 90.0% (18/20) in the training set, and with a sensitivity of 83.7% and a specificity of 80.0% in the blinded set. CONCLUSION: The SELDI-TOF-MS technique can correctly distinguish NSCLC patients from healthy people, and it has potential for the development of a screening test for the detection of NSCLC.


Subjects
Carcinoma, Non-Small-Cell Lung/blood; Lung Neoplasms/blood; Neoplasm Proteins/blood; Protein Array Analysis; Adult; Aged; Biomarkers, Tumor/blood; Carcinoma, Non-Small-Cell Lung/pathology; Case-Control Studies; Female; Humans; Lung Neoplasms/pathology; Male; Middle Aged; Neoplasm Staging; Protein Array Analysis/methods; Proteomics; Sensitivity and Specificity; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
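The training-set figures in the abstract above can be checked with a few lines of arithmetic. The sketch below recomputes sensitivity and specificity from the reported fractions (71/74 and 18/20); the helper name `percent` is an illustrative assumption. Note that the denominator 74 equals the 53 NSCLC plus 21 small cell lung cancer cases in the training set. The blinded-set counts are not given in the abstract, so they are not reconstructed here.

```python
def percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 1)


# Training set, as reported in the abstract.
sensitivity = percent(71, 74)  # cancer samples classified as cancer
specificity = percent(18, 20)  # healthy samples classified as healthy

print(f"sensitivity = {sensitivity}%")  # 95.9%
print(f"specificity = {specificity}%")  # 90.0%
```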